Poster: danigalv Date: Sep 23, 2020 3:53pm
Forum: audio Subject: How are asr.js files created?

The asr.js file format seems to correspond to the filetype "ASR" when searching in the index. It contains both a transcript and timestamps. Based on this page, https://archive.org/help/derivatives.php, it appears to be generated only when an mp3 file is uploaded.

Some colleagues and I began wondering how these files are generated, because their timestamps are incredibly accurate from what we've observed.

First of all, the recognizer appears to transcribe only in English. For example, this Spanish audio has a nonsensical English transcript: https://archive.org/details/Montgomery_al_D_a_Episode_151_April_21_2015

1) Can someone describe which speech recognition software was used to generate this file? For example, does it come from some sort of third-party speech transcription API (such as Google, IBM, or rev.ai)? Does it handle out-of-vocabulary words? Note that I am a speech recognition expert, so the more technical detail, the better.

2) We also wonder whether there is any documentation or schema for this JSON file format. So far, we have only been inspecting it ad hoc.
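
In case it helps anyone looking at the same files, here is a minimal sketch of the kind of ad hoc inspection one can do without a schema. It assumes only that the file parses as plain JSON and makes no assumption about its keys. The function name `summarize` and the sample data are hypothetical; the sample is made up and is not the real asr.js layout.

```python
import json
from collections import Counter


def summarize(node, path="$", counts=None):
    """Count how often each (path, type) pair occurs in a parsed JSON value."""
    if counts is None:
        counts = Counter()
    counts[(path, type(node).__name__)] += 1
    if isinstance(node, dict):
        for key, value in node.items():
            summarize(value, f"{path}.{key}", counts)
    elif isinstance(node, list):
        # All list items share one path so repeated records collapse together.
        for item in node:
            summarize(item, f"{path}[]", counts)
    return counts


# Made-up sample for illustration only; NOT the actual asr.js structure.
sample = json.loads('[{"a": 1, "b": [0.5, 1.5]}, {"a": 2, "b": []}]')
shape = summarize(sample)

for (path, kind), n in sorted(shape.items()):
    print(f"{path}: {kind} x{n}")
```

To run this on a downloaded file, replace `sample` with the result of `json.load` on the opened file. Collapsing list items into a single path makes optional or inconsistently typed fields stand out, since they show up with lower counts or with more than one type.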